Overview

Dataset statistics

Number of variables: 9
Number of observations: 800
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 8
Duplicate rows (%): 1.0%
Total size in memory: 50.9 KiB
Average record size in memory: 65.2 B
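
The two derived figures in this table can be checked from the others. The sketch below assumes that the duplicate percentage is the duplicate count divided by the number of observations and that the average record size is the total size divided by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_rows = 800
n_duplicate_rows = 8
total_size_kib = 50.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average bytes per record: total KiB converted to bytes, divided by rows.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```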

Variable types

Numeric: 8
Boolean: 1

Alerts

Dataset has 8 (1.0%) duplicate rows [Duplicates]
Total is highly correlated with HP and 5 other fields [High correlation]
HP is highly correlated with Total and 1 other field [High correlation]
Attack is highly correlated with Total and 2 other fields [High correlation]
Defense is highly correlated with Total and 2 other fields [High correlation]
Sp. Atk is highly correlated with Total and 1 other field [High correlation]
Sp. Def is highly correlated with Total and 2 other fields [High correlation]
Speed is highly correlated with Total [High correlation]
Total is highly correlated with HP and 6 other fields [High correlation]
HP is highly correlated with Total [High correlation]
Attack is highly correlated with Total [High correlation]
Defense is highly correlated with Total and 1 other field [High correlation]
Sp. Atk is highly correlated with Total and 1 other field [High correlation]
Sp. Def is highly correlated with Total and 2 other fields [High correlation]
Speed is highly correlated with Total [High correlation]
Legendary is highly correlated with Total [High correlation]
Total is highly correlated with HP and 4 other fields [High correlation]
HP is highly correlated with Total [High correlation]
Attack is highly correlated with Total [High correlation]
Defense is highly correlated with Total [High correlation]
Sp. Atk is highly correlated with Total [High correlation]
Sp. Def is highly correlated with Total [High correlation]
Total is highly correlated with HP and 6 other fields [High correlation]
HP is highly correlated with Total and 2 other fields [High correlation]
Attack is highly correlated with Total and 2 other fields [High correlation]
Defense is highly correlated with Total and 3 other fields [High correlation]
Sp. Atk is highly correlated with Total and 3 other fields [High correlation]
Sp. Def is highly correlated with Total and 2 other fields [High correlation]
Speed is highly correlated with Total and 1 other field [High correlation]
Legendary is highly correlated with Total and 1 other field [High correlation]

Reproduction

Analysis started: 2022-03-16 04:23:27.667980
Analysis finished: 2022-03-16 04:23:33.674632
Duration: 6.01 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Total
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 200
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 435.1025
Minimum: 180
Maximum: 780
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 180
5-th percentile: 250
Q1: 330
Median: 450
Q3: 515
95-th percentile: 630
Maximum: 780
Range: 600
Interquartile range (IQR): 185

Descriptive statistics

Standard deviation: 119.9630398
Coefficient of variation (CV): 0.2757121362
Kurtosis: -0.5074607103
Mean: 435.1025
Median Absolute Deviation (MAD): 85
Skewness: 0.1525299234
Sum: 348082
Variance: 14391.13091
Monotonicity: Not monotonic
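
Several of the statistics for Total are simple functions of the others. The sketch below recomputes them from the reported figures, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / number of observations). It is a consistency check on the printed numbers, not the profiler's own code.

```python
# Summary figures for Total, copied from the report.
n_rows = 800
total_sum = 348082
mean = 435.1025
std = 119.9630398
minimum, maximum = 180, 780
q1, q3 = 330, 515

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half of the data
cv = std / mean                   # coefficient of variation
variance = std ** 2               # squared standard deviation
mean_from_sum = total_sum / n_rows

print(value_range, iqr)
print(round(cv, 6), round(variance, 2), mean_from_sum)
```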
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
600                    37  4.6%
405                    26  3.2%
580                    23  2.9%
500                    23  2.9%
300                    19  2.4%
490                    18  2.2%
525                    16  2.0%
495                    15  1.9%
330                    15  1.9%
480                    15  1.9%
Other values (190)    593  74.1%

Value               Count  Frequency (%)
180                     1  0.1%
190                     1  0.1%
194                     1  0.1%
195                     3  0.4%
198                     1  0.1%
200                     3  0.4%
205                     5  0.6%
210                     3  0.4%
213                     1  0.1%
215                     1  0.1%

Value               Count  Frequency (%)
780                     3  0.4%
770                     2  0.2%
720                     1  0.1%
700                     9  1.1%
680                    13  1.6%
670                     4  0.5%
660                     1  0.1%
640                     1  0.1%
635                     1  0.1%
634                     2  0.2%

HP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 94
Distinct (%): 11.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.25875
Minimum: 1
Maximum: 255
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35.95
Q1: 50
Median: 65
Q3: 80
95-th percentile: 110
Maximum: 255
Range: 254
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 25.53466903
Coefficient of variation (CV): 0.368685098
Kurtosis: 7.232078374
Mean: 69.25875
Median Absolute Deviation (MAD): 15
Skewness: 1.568224376
Sum: 55407
Variance: 652.0193226
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
60                     67  8.4%
50                     63  7.9%
70                     57  7.1%
65                     46  5.8%
80                     43  5.4%
75                     43  5.4%
45                     38  4.8%
40                     38  4.8%
55                     37  4.6%
100                    32  4.0%
Other values (84)     336  42.0%

Value               Count  Frequency (%)
1                       1  0.1%
10                      1  0.1%
20                      6  0.8%
25                      2  0.2%
28                      1  0.1%
30                     13  1.6%
31                      1  0.1%
35                     15  1.9%
36                      1  0.1%
37                      1  0.1%

Value               Count  Frequency (%)
255                     1  0.1%
250                     1  0.1%
190                     1  0.1%
170                     1  0.1%
165                     1  0.1%
160                     1  0.1%
150                     4  0.5%
144                     1  0.1%
140                     1  0.1%
135                     1  0.1%

Attack
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 111
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.00125
Minimum: 5
Maximum: 190
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5-th percentile: 30
Q1: 55
Median: 75
Q3: 100
95-th percentile: 136.2
Maximum: 190
Range: 185
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 32.45736587
Coefficient of variation (CV): 0.4108462318
Kurtosis: 0.1697173149
Mean: 79.00125
Median Absolute Deviation (MAD): 20
Skewness: 0.551613748
Sum: 63201
Variance: 1053.480599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
100                    40  5.0%
65                     39  4.9%
50                     37  4.6%
80                     37  4.6%
85                     33  4.1%
60                     33  4.1%
75                     32  4.0%
70                     31  3.9%
90                     30  3.8%
55                     30  3.8%
Other values (101)    458  57.2%

Value               Count  Frequency (%)
5                       2  0.2%
10                      3  0.4%
15                      1  0.1%
20                      8  1.0%
22                      1  0.1%
23                      1  0.1%
24                      1  0.1%
25                      7  0.9%
27                      1  0.1%
29                      1  0.1%

Value               Count  Frequency (%)
190                     1  0.1%
185                     1  0.1%
180                     3  0.4%
170                     2  0.2%
165                     3  0.4%
164                     1  0.1%
160                     5  0.6%
155                     2  0.2%
150                    11  1.4%
147                     1  0.1%

Defense
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 103
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.8425
Minimum: 5
Maximum: 230
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5-th percentile: 35
Q1: 50
Median: 70
Q3: 90
95-th percentile: 130
Maximum: 230
Range: 225
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 31.18350056
Coefficient of variation (CV): 0.422297465
Kurtosis: 2.72626036
Mean: 73.8425
Median Absolute Deviation (MAD): 20
Skewness: 1.155912303
Sum: 59074
Variance: 972.4107071
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
70                     54  6.8%
50                     49  6.1%
60                     46  5.8%
80                     39  4.9%
40                     36  4.5%
65                     36  4.5%
90                     35  4.4%
100                    33  4.1%
45                     32  4.0%
55                     32  4.0%
Other values (93)     408  51.0%

Value               Count  Frequency (%)
5                       2  0.2%
10                      1  0.1%
15                      4  0.5%
20                      4  0.5%
23                      1  0.1%
25                      2  0.2%
28                      1  0.1%
30                     14  1.8%
32                      2  0.2%
33                      1  0.1%

Value               Count  Frequency (%)
230                     3  0.4%
200                     2  0.2%
184                     1  0.1%
180                     3  0.4%
168                     1  0.1%
160                     3  0.4%
150                     7  0.9%
145                     2  0.2%
140                     6  0.8%
135                     2  0.2%

Sp. Atk
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 105
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.82
Minimum: 10
Maximum: 194
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 10
5-th percentile: 30
Q1: 49.75
Median: 65
Q3: 95
95-th percentile: 131.05
Maximum: 194
Range: 184
Interquartile range (IQR): 45.25

Descriptive statistics

Standard deviation: 32.72229417
Coefficient of variation (CV): 0.4493586126
Kurtosis: 0.2978936607
Mean: 72.82
Median Absolute Deviation (MAD): 20
Skewness: 0.7446624978
Sum: 58256
Variance: 1070.748536
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
60                     51  6.4%
40                     49  6.1%
65                     44  5.5%
50                     39  4.9%
55                     35  4.4%
45                     33  4.1%
70                     30  3.8%
35                     29  3.6%
85                     27  3.4%
95                     27  3.4%
Other values (95)     436  54.5%

Value               Count  Frequency (%)
10                      3  0.4%
15                      4  0.5%
20                      8  1.0%
23                      1  0.1%
24                      2  0.2%
25                     11  1.4%
27                      2  0.2%
29                      1  0.1%
30                     24  3.0%
31                      1  0.1%

Value               Count  Frequency (%)
194                     1  0.1%
180                     3  0.4%
175                     1  0.1%
170                     3  0.4%
165                     2  0.2%
160                     2  0.2%
159                     1  0.1%
154                     2  0.2%
150                     9  1.1%
145                     4  0.5%

Sp. Def
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 92
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.9025
Minimum: 20
Maximum: 230
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 20
5-th percentile: 32.95
Q1: 50
Median: 70
Q3: 90
95-th percentile: 120
Maximum: 230
Range: 210
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 27.8289158
Coefficient of variation (CV): 0.3870368318
Kurtosis: 1.628394057
Mean: 71.9025
Median Absolute Deviation (MAD): 20
Skewness: 0.8540186115
Sum: 57522
Variance: 774.4485544
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
80                     52  6.5%
50                     50  6.2%
55                     47  5.9%
65                     44  5.5%
60                     43  5.4%
70                     40  5.0%
75                     40  5.0%
90                     36  4.5%
45                     35  4.4%
40                     30  3.8%
Other values (82)     383  47.9%

Value               Count  Frequency (%)
20                      6  0.8%
23                      1  0.1%
25                     11  1.4%
30                     20  2.5%
31                      1  0.1%
32                      1  0.1%
33                      1  0.1%
34                      1  0.1%
35                     18  2.2%
36                      1  0.1%

Value               Count  Frequency (%)
230                     1  0.1%
200                     1  0.1%
160                     2  0.2%
154                     3  0.4%
150                     7  0.9%
140                     2  0.2%
138                     1  0.1%
135                     4  0.5%
130                     9  1.1%
129                     1  0.1%

Speed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 108
Distinct (%): 13.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.2775
Minimum: 5
Maximum: 180
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 5
5-th percentile: 25
Q1: 45
Median: 65
Q3: 90
95-th percentile: 115
Maximum: 180
Range: 175
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 29.06047372
Coefficient of variation (CV): 0.4256229903
Kurtosis: -0.2364366728
Mean: 68.2775
Median Absolute Deviation (MAD): 21
Skewness: 0.3579332951
Sum: 54622
Variance: 844.5111327
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
50                     46  5.8%
60                     44  5.5%
70                     37  4.6%
65                     36  4.5%
30                     35  4.4%
80                     33  4.1%
40                     32  4.0%
90                     31  3.9%
100                    31  3.9%
55                     30  3.8%
Other values (98)     445  55.6%

Value               Count  Frequency (%)
5                       2  0.2%
10                      3  0.4%
15                      9  1.1%
20                     15  1.9%
22                      1  0.1%
23                      4  0.5%
24                      1  0.1%
25                     10  1.2%
28                      4  0.5%
29                      3  0.4%

Value               Count  Frequency (%)
180                     1  0.1%
160                     1  0.1%
150                     4  0.5%
145                     3  0.4%
140                     2  0.2%
135                     2  0.2%
130                     6  0.8%
128                     1  0.1%
127                     1  0.1%
126                     1  0.1%

Generation
Real number (ℝ≥0)

Distinct: 6
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.32375
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.6612904
Coefficient of variation (CV): 0.4998241145
Kurtosis: -1.239575758
Mean: 3.32375
Median Absolute Deviation (MAD): 2
Skewness: 0.01425810028
Sum: 2659
Variance: 2.759885795
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)

Value               Count  Frequency (%)
1                     166  20.8%
5                     165  20.6%
3                     160  20.0%
4                     121  15.1%
2                     106  13.2%
6                      82  10.2%

Value               Count  Frequency (%)
1                     166  20.8%
2                     106  13.2%
3                     160  20.0%
4                     121  15.1%
5                     165  20.6%
6                      82  10.2%

Value               Count  Frequency (%)
6                      82  10.2%
5                     165  20.6%
4                     121  15.1%
3                     160  20.0%
2                     106  13.2%
1                     166  20.8%

Legendary
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 928.0 B

Value               Count  Frequency (%)
False                 735  91.9%
True                   65  8.1%

Interactions

[Pairwise interaction plots: figures omitted]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
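
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the profiler's implementation: it assumes tied values receive the average of their ranks, and the helper names `average_ranks`, `pearson` and `spearman` are made up for this example. The sample values are Total and HP from the first ten rows shown in the Sample section.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/(n-1) factors of the covariance and both deviations cancel.
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

total = [318, 405, 525, 625, 309, 405, 534, 634, 634, 314]
hp = [45, 60, 80, 80, 39, 58, 78, 78, 78, 44]
print(spearman(total, hp))
```

Because only ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.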

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables (up to a sign flip when a scale factor is negative), so for an exactly linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
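
A minimal sketch of this calculation, using sample covariance and sample standard deviations with the n - 1 denominator (an assumption; the normalisation cancels in the ratio either way). The function name `pearson_r` and the toy data are illustrative only. The last lines show the invariance under changes of location and scale.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [3 * a + 7 for a in x]
y_shifted = [0.5 * b - 2 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)

print(r, r_shifted)
```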

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
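
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; it is an illustration under that assumption, and libraries commonly report a tie-adjusted variant (tau-b) instead, so values can differ when the data contain ties. The function name `kendall_tau_a` is made up for this example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Six pairs in total: five concordant, one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```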

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. See the Phik documentation for details.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

First rows

     Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      318   45      49       49       65       65     45           1  False
1      405   60      62       63       80       80     60           1  False
2      525   80      82       83      100      100     80           1  False
3      625   80     100      123      122      120     80           1  False
4      309   39      52       43       60       50     65           1  False
5      405   58      64       58       80       65     80           1  False
6      534   78      84       78      109       85    100           1  False
7      634   78     130      111      130       85    100           1  False
8      634   78     104       78      159      115    100           1  False
9      314   44      48       65       50       64     43           1  False

Last rows

     Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
790    245   40      30       35       45       40     55           6  False
791    535   85      70       80       97       80    123           6  False
792    680  126     131       95      131       98     99           6  True
793    680  126     131       95      131       98     99           6  True
794    600  108     100      121       81       95     95           6  True
795    600   50     100      150      100      150     50           6  True
796    700   50     160      110      160      110    110           6  True
797    600   80     110       60      150      130     70           6  True
798    680   80     160       60      170      130     80           6  True
799    600   80     110      120      130       90     70           6  True

Duplicate rows

Most frequently occurring

     Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary  # duplicates
4      520   50      65      107      105      107     86           4  False                 5
1      316   50      53       48       53       48     64           5  False                 3
3      498   75      98       63       98       63    101           5  False                 3
0      205   50      35       55       25       25     15           3  False                 2
2      466   74      48       76       83       81    104           6  False                 2
5      580   79     115       70      125       80    111           5  True                  2
6      580   91      72       90      129       90    108           5  False                 2
7      680  126     131       95      131       98     99           6  True                  2
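
A duplicate count like the one in this table can be reproduced by counting identical rows. The sketch below is a minimal illustration using only the standard library, not the profiler's own method; it uses rows 792 to 794 from the "Last rows" sample, where rows 792 and 793 are identical.

```python
from collections import Counter

# Columns: Total, HP, Attack, Defense, Sp. Atk, Sp. Def, Speed, Generation, Legendary
rows = [
    (680, 126, 131, 95, 131, 98, 99, 6, True),   # row 792
    (680, 126, 131, 95, 131, 98, 99, 6, True),   # row 793, identical to 792
    (600, 108, 100, 121, 81, 95, 95, 6, True),   # row 794
]

# Count how often each distinct row occurs and keep those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```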